The goal of this tutorial is to explore how to fit hidden Markov models
(HMMs) to movement data. To do so, we will investigate the R package
momentuHMM. This package builds on a slightly older package,
moveHMM, which was developed by Théo Michelot, Roland Langrock, and
Toby Patterson; see the associated paper:
https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.12578.
momentuHMM was developed by Brett McClintock and Théo Michelot.
momentuHMM adds new features, such as support for additional data streams,
covariates supplied as raster layers, and much more; see the associated
paper:
https://besjournals.onlinelibrary.wiley.com/doi/abs/10.1111/2041-210X.12995.
The primary learning objectives in this first tutorial are to:
momentuHMM (\(\approx 30\) min)

First, let’s load the packages that we will need to complete the analyses. Of course, you need to have them installed first.
While the raster package is being phased out in favour of the new terra package, momentuHMM still relies on raster to extract covariates. Therefore, for this tutorial we will still use raster, but do start working with terra in your own work where possible. Make sure to set the working directory to “CANSSI_OTN_HMM_2023” of the HMM workshop folder:
One of the main features of the data used with HMMs is that locations are taken at regular time steps and that position error is negligible. For example, HMMs are well suited to GPS tags that take locations at a set temporal frequency (e.g., every \(2\) hrs). Without reprocessing, HMMs are not particularly good for irregular time series of locations or for locations with large measurement error (e.g., you would need to preprocess Argos data before applying HMMs).
Unfortunately, movement data for aquatic species are rarely collected at regular time intervals and without measurement error. For example, the data set we will work on is the movement track of three narwhals tagged with Fastloc GPS tags for two weeks, provided by Dr. Marianne Marcoux.
First, let’s import the narwhal movement data. For simplicity, we only examine the GPS data from two weeks in August 2017 for three individuals.
The function ymd_hms() transforms dates stored as character or numeric vectors into POSIXct objects, which are useful and commonly used in ecology.
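For instance, a quick illustration with made-up timestamps (toy values, not from the narwhal data):

```r
library(lubridate)

# Toy character timestamps (hypothetical values for illustration)
t <- ymd_hms(c("2017-08-08 00:20:00", "2017-08-08 00:30:00"), tz = "UTC")
class(t)                              # "POSIXct" "POSIXt"
difftime(t[2], t[1], units = "mins")  # Time difference of 10 mins
```

Once the times are POSIXct, arithmetic such as difftime works as expected, which we rely on below.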
The data we obtain is often quite messy with some records missing information and other records duplicated. We can filter records to keep only complete location data using !is.na(x) & !is.na(y). To remove duplicate records (same time, location, and location class), we will use the lag function from dplyr, which will use the value from the previous row so that we can compare to the current row.
tracks_gps <- tracks_gps %>%
# remove missing locations
filter(!is.na(x) & !is.na(y),
# remove identical records
!(time == lag(time) & x == lag(x) & y == lag(y) & loc_class == lag(loc_class)))

For the main tutorial, since we only use the Fastloc GPS data, we don’t have to deal with location error.
Now we will plot the tracks with a new package, ggOceanMaps, which will show the tracks on a map of the Earth. To do so, we need to choose the limits of the map. We set them to the limits in longitude and latitude of our data (bbox below). This package is convenient since it deals with longitude and latitude directly, so there is no need to transform our dataset (other methods usually require transforming the dataset into an sf object). In the crs argument, we define the coordinate reference system of the original data (in this case WGS84, which is defined by the EPSG code 4326).
#Transform the dataset into a data.frame with longitude, latitude and ID of the animal
df <- cbind.data.frame(Lon = c(tracks_gps$x),Lat = c(tracks_gps$y), ID = c(tracks_gps$ID))
#Define the geographical limits (box) of the background map: c(min_lon,max_lon,min_lat,max_lat)
bbox <-c(min(tracks_gps$x),max(tracks_gps$x),min(tracks_gps$y),max(tracks_gps$y))
#Plot the data, with background map
basemap(limits = c(bbox)) +
geom_spatial_path(data=df, aes(x = Lon, y = Lat,colour=factor(ID)), crs = 4326)+
ggtitle("Movement data of the narwhals")
We can also add the bathymetry to the map, to get a better sense of the movement, by adding the argument bathymetry = TRUE to the basemap function. To do so, we will need to zoom out.
df <- cbind.data.frame(Lon = c(tracks_gps$x),Lat = c(tracks_gps$y), ID = c(tracks_gps$ID))
bbox <-c(min(tracks_gps$x)-5,max(tracks_gps$x)+5,min(tracks_gps$y)-1,max(tracks_gps$y)+1)
#Plot the data with bathymetry, replace the code by the correct argument
basemap(limits = c(bbox),bathymetry= TRUE) +
geom_spatial_path(data=df, aes(x = Lon, y = Lat,colour=factor(ID)), crs = 4326)+
  ggtitle("Movement data of the narwhals")

The classic HMM assumes that the observations are collected in discrete time and that there are no missing data in the predictor variables. There are two key decisions we must make: (1) the temporal resolution to use, and (2) how to address data gaps. The desired resolution depends predominantly on the biological question you are asking, as different behaviours and biological processes occur at different spatial and temporal scales (e.g., seasonal migration, daily movement between foraging and resting grounds, and fine-scale foraging decisions). Generally, higher-resolution data are preferred, as they contain more information. However, the resolution can be too high, such that fine-scale variability drowns out the signal from coarse-scale patterns of interest (e.g., seasonal migration). In this case, we will be linking the movement data with high-resolution (75 s) dive data to look at finer-scale behaviours (on the order of a few hours). My rule of thumb is that you want 3-50 data points per behaviour. For behaviours spanning several hours, that roughly corresponds to a desired resolution between 2 min and 60 min.
First, let’s calculate the time difference between successive records using difftime and lead (compares current row to following row) and place these values in a new column called dt. Note that the time difference is in minutes (units = "mins"). For the last record of each individual (i.e., when ID != lead(ID)), we will set the value to NA.
# Calculate time difference between locations
tracks_gps <- tracks_gps %>%
mutate(dt = ifelse(ID == lead(ID), # If next data row is same individual
# calculate time difference
difftime(lead(time), time, units = "mins"), NA))

Let’s see what resolutions may be possible in the data by looking at the most frequent time gaps.
# Zoom in on short intervals
hist(tracks_gps$dt, 1000, main = NA, xlab = "Time difference (min)", xlim = c(0,100))

# identify the most frequent dt
tracks_gps %>%
{table(.$dt)} %>%
sort(decreasing = TRUE) %>%
head()
##
## 10 11 12 22 9 13
## 400 145 90 64 63 63
We see that the most frequent time gap is \(10\) min, followed by \(11\), \(12\), \(22\), \(9\), and \(13\) min. We also see that the majority of the gaps are \(< 60\) min; however, some are in excess of \(600\) min. Because HMMs assume observations are taken at regular time intervals, finer resolutions will contain more data gaps that would need to be interpolated. Frequent and large data gaps can be difficult to handle, especially as the number of missing data points approaches or exceeds the amount of existing data. Let’s examine the potential data structure at different resolutions for the different animals.
We first create a function that approximates the proportion of missing locations we would have for a given resolution.

# Make function to estimate proportion of missing locations
p_na <- function(time, res) {
time <- round_date(time, paste(res,"min")) # round to nearest resolution
time <- na.omit(time[time != lead(time)]) # remove duplicate time
# calculate maximum number of locations
max_n_loc <- length(seq(time[1], tail(time, 1) + res*60,
by = as.difftime(res, units = "mins")))
n_NA <- max_n_loc - (length(time)+1)
n_NA / max_n_loc
}

We can now use this function to look at the proportion of NAs we would get with 10, 20, 30, and 60 min resolutions.
# summarise track dt
tracks_gps %>%
group_by(ID) %>%
summarise(p_NA_10m = p_na(time, 10), # 10 min
p_NA_20m = p_na(time, 20), # 20 min
p_NA_30m = p_na(time, 30), # 30 min
p_NA_60m = p_na(time, 60)) %>% # 60 min
# return formatted table
kable(digits = 3, col.names = c("Narwhal ID", paste(c(10,20,30,60), "m"))) %>%
kable_styling() %>%
  add_header_above(c("", "Resolution" = 4))

| Narwhal ID | 10 m | 20 m | 30 m | 60 m |
|---|---|---|---|---|
| T172062 | 0.486 | 0.236 | 0.191 | 0.152 |
| T172064 | 0.502 | 0.203 | 0.120 | 0.074 |
| T172066 | 0.488 | 0.230 | 0.185 | 0.160 |
Here we see that at the \(10\) min interval, around \(50\%\) of the locations would be missing. This is a lot, but if we chose a finer resolution, interpolated data would outweigh the real data and could bias the results. For this tutorial, we will use a \(10\) min resolution.
There are several ways to deal with data gaps, and we will address two:
One strategy to address data gaps is to leave the data streams (i.e., step length and turning angle) as NAs. For my own work, I have typically voided sections with \(>6\) missing locations. The maximum gap size to allow depends on the frequency of the missing data, the frequency of locations, the study species, and the behaviours of interest.
Voiding gaps is particularly useful for moderate and large gaps; however, it may often be the best option for small gaps as well. The only catch is that although HMMs allow missing data in the state-dependent observation data (e.g., step length and turning angle), they cannot have missing data in covariates on the transition probabilities (for example, if there were an effect of bathymetry on state probability).
In this case, we will void the data streams for locations falling in gaps \(>60\) min. First, we will identify which steps need to be voided; then we will prepare the data and void the estimated step and angle data streams. We will do this again later in the tutorial, so we will wrap it into a function called prepData_NAGaps. The function will output a momentuHMMData object that can be fit directly using fitHMM.
This method will not work on data with multiple individuals. To get around this, we can split the data into a list in which each element is the location data for one ID. We can then use the R apply family of functions to process each animal separately.
# convert tracks back to data.frame with xy coordinates
tracks_gps_ls <- tracks_gps %>%
  split(., .$ID) # split into list

# define function to replace step and turn angle of large gaps with NA
NA_gaps <- function(tracks, times){
# rows where location is within a large gap
rows <- which(
rowSums(apply(times, 1, function(X, tracks){
dplyr::between(tracks,
as.POSIXct(X[1], tz = "UTC"),
as.POSIXct(X[2], tz = "UTC"))
}, tracks$time))==1)
tracks$step[rows] <- NA
tracks$angle[rows] <- NA
return(tracks)
}
# define function to identify and void gaps
prepData_NAGaps <- function(track_list, tracks_crw, res, max_gap, ...){
# for each ID, identify which rows have gaps >= max_gap
gaps_ls_rows <- lapply(track_list, function(x){
which(difftime(lead(x$time), x$time, units = "mins") >= max_gap)
})
# create sequence of times for each track from gaps >= max_gap
gap_ls <- mapply(FUN = function(track, gaps){
# identify start and end date of each gap
gaps_ls_srt_end <- list(start = ceiling_date(track$time[gaps], paste(res, "min")),
end = floor_date(track$time[gaps+1], paste(res, "min")))
# combine into single vector for each track
data.frame(start = gaps_ls_srt_end$start, end = gaps_ls_srt_end$end)
},
track_list, gaps_ls_rows, SIMPLIFY = FALSE)
# prep data and list by ID
prep_tracks <- prepData(tracks_crw, ...) %>%
{split(., .$ID)}
# Interpolate the location at the times from the sequence
mapply(FUN = NA_gaps, prep_tracks, gap_ls,
SIMPLIFY = FALSE) %>%
do.call("rbind", .) # convert list of tracks back to a single data.frame
}
prep_tracks_gps_crw_NAgaps <- prepData_NAGaps(tracks_gps_ls, tracks_gps_crw,
10, 60, type = "UTM")
# get time of day covariate (which will be used later)
prep_tracks_gps_crw_NAgaps$tod <- hour(prep_tracks_gps_crw_NAgaps$time) + minute(prep_tracks_gps_crw_NAgaps$time)/60
head(prep_tracks_gps_crw_NAgaps)

## ID step angle time x y
## T172062.1 T172062 253.5446 NA 2017-08-08 00:20:00 -285113.8 8176359
## T172062.2 T172062 231.0461 0.7415044 2017-08-08 00:30:00 -285315.1 8176513
## T172062.3 T172062 451.1109 0.8454366 2017-08-08 00:40:00 -285545.3 8176493
## T172062.4 T172062 689.7662 0.1654594 2017-08-08 00:50:00 -285813.7 8176130
## T172062.5 T172062 506.6135 0.2621016 2017-08-08 01:00:00 -286127.2 8175516
## T172062.6 T172062 241.1625 1.0919775 2017-08-08 01:10:00 -286232.7 8175020
## tod
## T172062.1 0.3333333
## T172062.2 0.5000000
## T172062.3 0.6666667
## T172062.4 0.8333333
## T172062.5 1.0000000
## T172062.6 1.1666667
The main idea is to use tracks_gps_ls to locate the large gaps with the NA_gaps function and then
fill them with NAs in the tracks_gps_crw dataset. We use the tracks_gps_crw dataset because its smaller gaps are already filled and its time resolution is already set. The result is stored in
prep_tracks_gps_crw_NAgaps.
Another strategy to deal with larger gaps is to segment the tracks, giving each segment a new individual ID. This may be particularly appropriate for gaps across which we may reasonably expect the underlying states to be effectively independent of one another. Specifically, we may ask: over what period of time does the behaviour of the animal affect its subsequent behaviour? In this case, we may expect that the behaviour of a narwhal depends on its behaviour over the preceding several hours, but is independent after 24 hours. We can split the tracks at gaps larger than a predetermined threshold by assigning a new ID to each segment. For each segmented track, it is then possible to void the large gaps and fit a correlated random walk over the smaller gaps, as we did before.
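As a sketch of this strategy (not code we use later; split_IDs and the 24 h default for max_gap are names and values I am introducing here for illustration), one could relabel IDs wherever the gap to the next location exceeds a threshold:

```r
library(dplyr)

# Sketch: split each track into new "individuals" at gaps > max_gap minutes,
# so a track a becomes a_0, a_1, ... (one label per segment)
split_IDs <- function(tracks, max_gap = 24 * 60) {
  tracks %>%
    group_by(ID) %>%
    mutate(gap_mins = c(0, as.numeric(difftime(time[-1], time[-n()],
                                               units = "mins"))),
           segment  = cumsum(gap_mins > max_gap)) %>%
    ungroup() %>%
    mutate(ID = paste(ID, segment, sep = "_")) %>%
    select(-gap_mins, -segment)
}
```

Each new ID could then be passed through the same prepData workflow as above, since momentuHMM treats each ID as an independent track.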
In the next part of this tutorial, we will explore how to fit hidden Markov models
(HMMs) to the prep_tracks_gps_crw_NAgaps dataset in the R package momentuHMM.
The primary learning objectives of this part are:
momentuHMM
Our data are ready to use, so we can fit the HMM. For this we use
the function fitHMM. This is where we need to make many of our
modelling decisions, and most of these decisions will be associated with
one of the arguments of the function. The minimum arguments fitHMM
requires are: fitHMM(data, nbStates, dist, Par0).
First, when we fit an HMM, we need to decide on the number of behavioural states we are going to model. To start simple, we will use only two behavioural states. These could, for example, represent one behaviour with fast and directed movement (e.g., travelling) and one behaviour with slower and more tortuous movement (e.g., residing). This means that the argument nbStates will be set to \(2\).
Second, we need to decide whether we make the transition probabilities between the behavioural states dependent on covariates. Here, we will start simple and we will not use covariates.
Third, we need to select the distribution we will use to model the step lengths and the one we will use to model the turning angles. For now, we will use the gamma distribution for the step lengths and the von Mises distribution for the turning angles. These are commonly used distributions for modelling movement data. This means that the argument dist will be set to list(step = "gamma", angle = "vm"). Note that dist should be a list with an item for each data stream column in the data that we are modelling (here, the columns step and angle). The gamma distribution is strictly positive (i.e., it does not allow for \(0\)s). If you have step lengths that are exactly zero in your data set, you need to use a zero-inflated distribution. But in this case, we have no zeros.
By default, fitHMM sets the argument estAngleMean to NULL, which means we assume that the mean angle is \(0\) for both behaviours (i.e., the animal has a tendency to continue in the same direction) and that only the angle concentration parameters differ. The concentration parameters control the extent to which the animal continues forward versus turns. Doing so reduces the number of parameters to estimate. These are all very important decisions that you must make when you construct your model.

In addition, we need to specify initial values for the parameters to estimate. HMMs are fitted by maximum likelihood estimation (MLE) with a numerical optimizer. An unfortunate aspect of fitting models with numerical optimizers is that, to explore the parameter space and find the best parameter values for the model, the optimizer needs to start somewhere, and you need to decide where. Choosing bad starting values can result in parameter estimates that are not the MLE, but just a local maximum. To choose your initial parameters, you can take a quick peek at the data (e.g., using the plots below) and use some general biological information. For example, it is common for animals to have one behaviour with long step lengths and small turning angles and one behaviour with short step lengths and larger turning angles. From the plots below (note that we can simply call plot on our data), it looks like the animal has step lengths that cluster around \(100\) and \(600\). There could be a third behaviour with larger step lengths, but for now we will consider only two.
plot(prep_tracks_gps_crw_NAgaps$step ~
prep_tracks_gps_crw_NAgaps$time, ty = "l", ylab = "Step length",
xlab = "Date", las = 1)
abline(h = 100, col = rgb(1, 0, 0, 0.5))
abline(h = 600, col = rgb(1, 0, 0, 0.5))

So let’s choose \(100\) and \(600\) for the mean step lengths and use half those values for the standard deviations. The turning angles are either very close to \(0\) or fairly spread out between \(-\pi/2\) and \(\pi/2\). High concentration parameter values (\(\kappa\), pronounced kappa) mean that the animal has a tendency to keep moving in the same direction. Values close to \(0\) mean that the turning angle distribution is almost uniform (the animal turns in all directions). Note that \(\kappa\) cannot be exactly \(0\), so let’s choose \(0.1\) and \(1\).
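To see what these two \(\kappa\) values imply, the von Mises density can be written out with base R’s besselI (an illustration only, not part of the analysis pipeline; dvm_manual is a name I made up):

```r
# Von Mises density: f(x) = exp(kappa * cos(x - mu)) / (2 * pi * I0(kappa)),
# where I0 is the modified Bessel function of the first kind, order 0
dvm_manual <- function(x, mu = 0, kappa = 1) {
  exp(kappa * cos(x - mu)) / (2 * pi * besselI(kappa, 0))
}

x <- seq(-pi, pi, length.out = 200)
plot(x, dvm_manual(x, kappa = 1), type = "l",
     xlab = "Turning angle (radians)", ylab = "Density")
lines(x, dvm_manual(x, kappa = 0.1), lty = 2)  # nearly flat: turns everywhere
legend("topright", legend = c("kappa = 1", "kappa = 0.1"), lty = 1:2)
```

The \(\kappa = 0.1\) curve is almost uniform (turning in all directions), while \(\kappa = 1\) already concentrates mass around \(0\) (forward persistence), which is why we pair them with the slow and fast states respectively.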
# define states (optional)
stateNames <- c("resident", "travel")
nbState <- length(stateNames)
# define distribution to use for each data stream
dist <- list(step = "gamma", angle = "vm")
# Setting up the starting values
mu0 <- c(100, 600) # Mean step length
sigma0 <- mu0/2 # SD of the step length
kappa0 <- c(0.1, 1) # Turn angle concentration parameter
# combine starting parameters
Par0 <- list(step = c(mu0, sigma0), angle = kappa0)

OK, we are ready. Let’s fit the HMM and look at the parameter estimates.
# Fit a 2 state HMM
HMM_gps_crw_NAgaps <- fitHMM(prep_tracks_gps_crw_NAgaps,
stateNames = stateNames, nbStates = 2,
dist = dist, Par0 = Par0)
# Let's look at parameter estimates
HMM_gps_crw_NAgaps

## Value of the maximum log-likelihood: -17884.91
##
##
## step parameters:
## ----------------
## resident travel
## mean 282.0578 754.6505
## sd 153.8772 251.0847
##
## angle parameters:
## -----------------
## resident travel
## mean 0.000000 0.00000
## concentration 2.277153 18.01728
##
## Regression coeffs for the transition probabilities:
## ---------------------------------------------------
## 1 -> 2 2 -> 1
## (Intercept) -2.601642 -2.563677
##
## Transition probability matrix:
## ------------------------------
## resident travel
## resident 0.93096716 0.06903284
## travel 0.07151298 0.92848702
##
## Initial distribution:
## ---------------------
## resident travel
## 0.3025479 0.6974521
We can also plot them
## Decoding state sequence... DONE
Based on the mean step parameters, it looks like the first behavioural state (resident) has smaller step lengths than state 2. This is particularly easy to see in the step histogram (first figure), where the estimated distribution for each state is overlaid on the observed step length frequencies. The turning angle distribution of the second state (travel) indicates much more directed movement; its concentration parameter is substantially larger than that of state 1. Looking at the track, we can see the estimated transitions between states.
Maybe looking at the state sequence in more detail will help us understand. We are often interested in identifying when an animal is in each of the behavioural states (here, when the animal is in state 1 vs. state 2), something we call state decoding. To do so, we can use the function viterbi, which uses the Viterbi algorithm to produce the most likely sequence of states given your fitted model and data.
# Apply the Viterbi algorithm using your fitted model object
dec_States <- viterbi(HMM_gps_crw_NAgaps)
# Let's look at predicted states of the first 20 time steps
head(dec_States, 20)

## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2

# Tabulate the number of time steps decoded in each state
table(dec_States)

## dec_States
## 1 2
## 1476 1446
In many cases, it is more interesting to get the probability of being in each state rather than the most likely state. To do so, you can use the function stateProbs, which returns a matrix of the probability of being in each state for each time step.
# Calculate the probability of being in each state
statep <- stateProbs(HMM_gps_crw_NAgaps)
# Let's look at the state probability matrix
head(statep)

## resident travel
## [1,] 0.9944545 5.545466e-03
## [2,] 0.9999782 2.183931e-05
## [3,] 0.9948147 5.185281e-03
## [4,] 0.6298960 3.701040e-01
## [5,] 0.7077568 2.922432e-01
## [6,] 0.9999950 4.965048e-06
We can see here the probability of being in each state for the first \(6\) time steps. Here, the probability of being in state 1 is really high for these steps, but that may not always be the case. Sometimes you might have values closer to \(0.5\), which would indicate that, for that time step, it is hard to identify which state the animal is in (i.e., which state’s step length and turning angle distributions fit best).
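As a small helper along these lines (flag_ambiguous and its 0.75 threshold are my own, not part of momentuHMM), you could flag the time steps whose classification is uncertain:

```r
# Flag time steps where no state reaches a chosen probability threshold
# (0.75 is an arbitrary choice; adjust to your tolerance for ambiguity)
flag_ambiguous <- function(state_probs, threshold = 0.75) {
  which(apply(state_probs, 1, max) < threshold)
}

# Applied to the state probability matrix computed above:
# flag_ambiguous(statep)
```

The proportion of flagged steps gives a rough sense of how cleanly the two states separate in your data.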
You can plot the results from both of these functions using the function plotStates.
## Decoding state sequence... DONE
## Computing state probabilities... DONE
The first thing we need to look into is whether the parameter estimates are reliable. As mentioned above, the initial parameter values can affect the estimation procedure, so it is always good to check whether you get similar results with different starting values. Here, we will make the starting values of the two behavioural states closer to one another.
# Setting up the starting values
mu2 <- c(400, 600) # Mean step length
sigma2 <- mu2/2 # SD of the step length
kappa2 <- c(1, 1) # Turn angle concentration parameter
# combine starting parameters
Par2 <- list(step = c(mu2, sigma2), angle = kappa2)
# Fit the same 2 state HMM with the new starting values
HMM_gps_crw_NAgaps2 <- fitHMM(prep_tracks_gps_crw_NAgaps,
stateNames = stateNames, nbStates = 2,
dist = dist, Par0 = Par2)

Let’s compare the two results. First, let’s look at which of the two has the lowest negative log likelihood (equivalent to the highest log likelihood, so closest to the real MLE). Let’s also look at the parameter estimates they each returned.
# Negative log likelihood
c(original = HMM_gps_crw_NAgaps$mod$minimum, new = HMM_gps_crw_NAgaps2$mod$minimum)

## original new
## 17884.91 17884.91
## resident travel resident travel
## mean 282.0578 754.6505 282.0578 754.6505
## sd 153.8772 251.0847 153.8772 251.0847
## resident travel resident travel
## mean 0.000000 0.00000 0.000000 0.00000
## concentration 2.277153 18.01728 2.277153 18.01728
## resident travel resident travel
## resident 0.93096716 0.06903284 0.93096716 0.06903284
## travel 0.07151298 0.92848702 0.07151298 0.92848702
Looks like they both returned very similar values for everything, so that’s good! The function fitHMM also has the argument retryFits, which perturbs the parameter estimates and retries fitting the model. We use it when we think the estimates could be a local maximum. Since the optimization of HMMs depends on the starting values chosen, it is always good to explore multiple initial values and then choose the best estimates among them (which is what retryFits does). The argument indicates the number of times you want to perturb the parameters and retry fitting the model (you can choose the size of the perturbation by setting the argument retrySD).
# Fit the same 2-state HMM with retryFits
# This is a random perturbation, so we set the seed to get the same results
set.seed(29)
HMM_RF <- fitHMM(prep_tracks_gps_crw_NAgaps,
nbState = 2,
dist=list(step="gamma", angle="vm"),
Par0 = list(step=c(mu2, sigma2), angle=kappa2),
retryFits = 10)

## Attempting to improve fit using random perturbation. Press 'esc' to force exit from 'fitHMM'
##
Attempt 1 of 10 -- current log-likelihood value: -17884.91
Attempt 2 of 10 -- current log-likelihood value: -17884.91
Attempt 3 of 10 -- current log-likelihood value: -17884.91
Attempt 4 of 10 -- current log-likelihood value: -17884.91
Attempt 5 of 10 -- current log-likelihood value: -17884.91
Attempt 6 of 10 -- current log-likelihood value: -17884.91
Attempt 7 of 10 -- current log-likelihood value: -17884.91
Attempt 8 of 10 -- current log-likelihood value: -17884.91
Attempt 9 of 10 -- current log-likelihood value: -17884.91
Attempt 10 of 10 -- current log-likelihood value: -17884.91
Let’s compare the results again.
# Negative log likelihood
c(original = HMM_gps_crw_NAgaps$mod$minimum, new = HMM_gps_crw_NAgaps2$mod$minimum,
retryFits = HMM_RF$mod$minimum)

## original new retryFits
## 17884.91 17884.91 17884.91
# Parameter estimates
cbind(HMM_gps_crw_NAgaps$mle$step, HMM_gps_crw_NAgaps2$mle$step, HMM_RF$mle$step)

## resident travel resident travel state 1 state 2
## mean 282.0578 754.6505 282.0578 754.6505 282.0581 754.6507
## sd 153.8772 251.0847 153.8772 251.0847 153.8775 251.0848
## resident travel resident travel state 1 state 2
## mean 0.000000 0.00000 0.000000 0.00000 0.00000 0.00000
## concentration 2.277153 18.01728 2.277153 18.01728 2.27715 18.01732
## resident travel resident travel state 1 state 2
## resident 0.93096716 0.06903284 0.93096716 0.06903284 0.93096679 0.06903321
## travel 0.07151298 0.92848702 0.07151298 0.92848702 0.07151399 0.92848601
Still looking very similar!
momentuHMM has functions to help select initial parameters for complex models. However, these functions rely on the user choosing initial parameter values for a simple model like the one we just explored.
We fitted a model to the data, and it looks like the parameter estimates are reliable and we can get state probabilities, but is this a good model for our data? Can we use the model results? The best way to investigate model fit is through pseudo-residuals. Pseudo-residuals are a type of model residual that accounts for the interdependence of observations. They are calculated for each of the time series (e.g., you will have pseudo-residuals for the step length time series and for the turning angle time series). If the fitted model is appropriate, the pseudo-residuals produced by the function pseudoRes should follow a standard normal distribution. You can look at the pseudo-residuals directly via the function plotPR, which plots the pseudo-residual time series, the qq-plots, and the autocorrelation functions (ACF).
## Computing pseudo-residuals...
Both the qq-plot and the ACF plot indicate that the model does not fit the step length time series particularly well. The ACF indicates that there is severe autocorrelation remaining in the time series, even after accounting for the persistence in the underlying behavioural states.
This could indicate that there are more hidden behavioural states. Let’s try a 3-state HMM.

# Setting up the starting values
mu3 <- c(100, 600, 1000) # Mean step length
sigma3 <- mu3/2 # SD of the step length
kappa3 <- c(0.1, 1, 1) # Turn angle concentration parameter
# combine starting parameters
Par3 <- list(step = c(mu3, sigma3), angle = kappa3)
# Fit a 3-state HMM
HMM_gps_crw_NAgaps3 <- fitHMM(prep_tracks_gps_crw_NAgaps,
nbStates = 3,
dist = dist, Par0 = Par3)
plot(HMM_gps_crw_NAgaps3)

## Decoding state sequence... DONE
If we look at the pseudo-residuals
## Computing pseudo-residuals...
It’s better. Not perfect, but there is less unexplained autocorrelation, especially in the step lengths. If we look at the step length distributions, we can see that we still have a state with really small steps (maybe resting?), a state with medium steps (maybe foraging?), and a state with large steps (maybe travelling?). Looking at the track, the movement in state 2 does look more tortuous and concentrated in focal areas, while the movement in state \(3\) is very directed and spans larger areas.
We can add covariates to our HMM to better understand drivers of behaviour. We can either include covariates on the transition probabilities (i.e., the state process) or in the state-dependent distributions (i.e., the observation process). Respectively, these answer the questions:
Does a covariate have an effect on the probability of switching between states?
Does a covariate have an effect on the movement within each state?
It’s probably most common to include covariates on the transition probabilities in an HMM to assess how spatial, temporal, or demographic factors influence the state-switching dynamics. For example, prey density may increase the probability of switching into a foraging state or states may follow temporal patterns throughout the day. At time \(t\), each transition probability is linked to covariates using a multinomial logit link \[\gamma_{ij}^{(t)} = Pr(S_{t+1} = j \mid S_t = i) = \frac{\exp(\eta_{ij}^{(t)})}{\sum_{k=1}^{K}\exp(\eta_{ik}^{(t)})} \] with the linear predictor for \(P\) covariates \(\{\omega_1^{(t)}, \omega_2^{(t)}, \dots \omega_P^{(t)}\}\) given as \[\eta_{ij}^{(t)} = \begin{cases} \beta^{(ij)}_0 + \sum_{p=1}^{P} \beta_p^{(ij)} \omega_p^{(t)} & \text{if } i \neq j \\ 0 & \text{otherwise.} \end{cases} \] For each transition probability, \(\beta_0^{(ij)}\) is an intercept parameter, and \(\beta_p^{(ij)}\) measures the effect of the \(p\)-th covariate \(\omega_p\).
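To make the link concrete, here is the 2-state case computed by hand in R, plugging in the intercepts reported for the fitted 2-state model above (with no covariates, \(\eta_{ij}\) is just \(\beta_0^{(ij)}\)):

```r
# eta_ij = 0 on the diagonal; off-diagonals are the linear predictors
eta <- matrix(0, nrow = 2, ncol = 2)
eta[1, 2] <- -2.601642  # intercept for 1 -> 2 (from the fitted model above)
eta[2, 1] <- -2.563677  # intercept for 2 -> 1 (from the fitted model above)

# Row-wise multinomial logit: each row of the matrix sums to 1
tpm <- exp(eta) / rowSums(exp(eta))
round(tpm, 4)
#        [,1]   [,2]
# [1,] 0.9310 0.0690
# [2,] 0.0715 0.9285
```

These values reproduce the transition probability matrix printed by fitHMM earlier, which shows that the regression coefficients and the transition probabilities are two views of the same quantities.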
Here, we will assess whether the transition probabilities vary as a function of time of day, and we will use a 2-state model for demonstration.

The first step is to define our model formula. Since time of day is a cyclic covariate, we must define a relationship that ensures the transition probabilities are similar at the very end and very beginning of the day. We can do this by specifying the transition probabilities as a trigonometric (i.e., cyclic) function of time of day (defined as \(\tau\)),
\[\eta_{ij}^{(t)} = \begin{cases}
\beta^{(ij)}_0 + \beta_1^{(ij)} \cos \left(\frac{2 \pi \tau_t}{24}\right) + \beta_2^{(ij)} \sin \left(\frac{2 \pi \tau_t}{24}\right) & \text{if } i \neq j \\
0 & \text{otherwise}
\end{cases} ,\]
where 24 refers to the 24-hour period of interest. This relationship can conveniently be specified with the cosinor function.
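As a quick illustration of the two trigonometric terms in the equation above (toy tod values chosen by me; this just evaluates the cos and sin components by hand, it is not a substitute for cosinor):

```r
# The cyclic terms at a few times of day: note the identical values
# at tod = 0 and tod = 24, which is what makes the covariate cyclic
tod <- c(0, 6, 12, 18, 24)
cbind(tod,
      cos_term = cos(2 * pi * tod / 24),
      sin_term = sin(2 * pi * tod / 24))
```

The cosinor term in the formula below generates exactly this pair of cyclic predictors, so a single covariate contributes two regression coefficients per transition.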
# set new formula
formula <- ~cosinor(tod, period = 24)

Next, we need to specify initial parameters, and we will use the function getPar0 to do so. This uses the estimated parameters of a previously fitted 2-state model with no covariates. Note that it sets the unknown linear predictor parameters to 0 and uses the intercepts of the previous model.
# set tpm formula and extract initial values from simpler model
Par0_m2 <- getPar0(model = HMM_gps_crw_NAgaps, formula = formula)
Par0_m2

## $Par
## $Par$step
## mean_1 mean_2 sd_1 sd_2
## 282.0578 754.6505 153.8772 251.0847
##
## $Par$angle
## concentration_1 concentration_2
## 2.277153 18.017285
##
##
## $beta
## 1 -> 2 2 -> 1
## (Intercept) -2.601642 -2.563677
## cosinorCos(tod, period = 24) 0.000000 0.000000
## cosinorSin(tod, period = 24) 0.000000 0.000000
##
## $delta
## [1] 0.3025479 0.6974521
Then we can fit the model. The formula argument defines the model formula for the transition probabilities, and we need to set the correct initial parameters (i.e., Par0 for the observation parameters and beta0 for the transition probability model).
# fit model
m_tod_tpm <- fitHMM(prep_tracks_gps_crw_NAgaps,
nbStates=2,
dist=dist,
Par0=Par0_m2$Par,
beta0=Par0_m2$beta,
formula=formula)
m_tod_tpm$mle$beta
## 1 -> 2 2 -> 1
## (Intercept) -2.569164871 -2.4871784
## cosinorCos(tod, period = 24) 0.182094271 0.2961311
## cosinorSin(tod, period = 24) 0.004107689 -0.1570752
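Before plotting, we can reconstruct the fitted relationship by hand to see how much the switching probability actually varies over the day (a base-R sketch using the 1 -> 2 coefficients printed above):

```r
# Pr(1 -> 2) as a function of time of day, from the fitted coefficients
b0 <- -2.569164871  # intercept
b1 <-  0.182094271  # cosine coefficient
b2 <-  0.004107689  # sine coefficient
tod <- seq(0, 24, by = 0.5)
eta <- b0 + b1 * cos(2 * pi * tod / 24) + b2 * sin(2 * pi * tod / 24)
gamma_12 <- exp(eta) / (1 + exp(eta))
round(range(gamma_12), 3)  # switching probability varies modestly over the day
```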
There are several ways to visualise the relationship. If you use the standard plot() function on the fitted HMM, you can visualise the transition probabilities directly. Another option is to plot the stationary state probabilities, which represent how likely the animal is to be in each behaviour at each time of day.
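For a 2-state chain, these stationary probabilities have a simple closed form, \(\delta = (\gamma_{21}, \gamma_{12}) / (\gamma_{12} + \gamma_{21})\), so they can also be computed by hand from a transition matrix (a sketch with illustrative switching probabilities, not values taken from the fitted model):

```r
# Stationary distribution of a 2-state Markov chain: the chain spends
# proportionally more time in the state that is harder to leave.
gamma_12 <- 0.065  # illustrative Pr(state 1 -> state 2)
gamma_21 <- 0.070  # illustrative Pr(state 2 -> state 1)
delta <- c(gamma_21, gamma_12) / (gamma_12 + gamma_21)
round(delta, 3)
# Check: delta is unchanged by one step of the transition matrix
Gamma <- matrix(c(1 - gamma_12, gamma_12, gamma_21, 1 - gamma_21),
                nrow = 2, byrow = TRUE)
all.equal(as.numeric(delta %*% Gamma), delta)
```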
plotStationary(m_tod_tpm, plotCI = TRUE)
This section will illustrate how to model mean step length as a function of time of day. Any parameter in the observation model (e.g., the mean/sd of the step lengths, or the turning angle parameters) can be modelled as a function of a covariate, which allows the movement behaviour within each discrete state to vary. Here, we will use the same trigonometric function described in the previous section to model mean step length based on the time of day. To do so, we specify DM, which is a list of formulas for the step length (and possibly other) parameters. For simplicity, we will only model the mean, but it is common to also model the standard deviation with the same relationship. We can derive the initial parameters using the getPar0 function, as described in the previous section.
set.seed(1)
# define step formula and design matrix
DM <- list(step = list(mean = formula, sd = ~1))
# extract initial values from simpler 2-state model with no covariates
Par0_m3 <- getPar0(model = HMM_gps_crw_NAgaps, DM = DM)
Now we can fit the HMM with covariate-dependent observation parameters.
# fit model
m_tod_obs <- fitHMM(prep_tracks_gps_crw_NAgaps,
nbStates = 2,
dist = dist,
Par0 = Par0_m3$Par,
beta0 = Par0_m3$beta,
DM = DM)
m_tod_obs
## Value of the maximum log-likelihood: -17849.32
##
##
## Regression coeffs for step parameters:
## --------------------------------------
## mean_1:(Intercept) mean_1:cosinorCos(tod, period = 24)
## [1,] 5.665086 -0.02765113
## mean_1:cosinorSin(tod, period = 24) mean_2:(Intercept)
## [1,] 0.0438113 6.596116
## mean_2:cosinorCos(tod, period = 24) mean_2:cosinorSin(tod, period = 24)
## [1,] -0.1421544 0.1260971
## sd_1:(Intercept) sd_2:(Intercept)
## [1,] 5.077695 5.467302
##
## step parameters (based on mean covariate values):
## -------------------------------------------------
## state 1 state 2
## mean 297.9917 854.4029
## sd 160.4039 236.8204
##
## angle parameters:
## -----------------
## state 1 state 2
## mean 0.000000 0.00000
## concentration 2.345858 18.60938
##
## Regression coeffs for the transition probabilities:
## ---------------------------------------------------
## 1 -> 2 2 -> 1
## (Intercept) -2.673261 -2.586582
##
## Transition probability matrix:
## ------------------------------
## state 1 state 2
## state 1 0.93543027 0.06456973
## state 2 0.07000697 0.92999303
##
## Initial distribution:
## ---------------------
## state 1 state 2
## 0.2959291 0.7040709
The regression coefficients are difficult to interpret for a cyclic function, but we can view the estimated relationship for each state with confidence intervals. Note that when the model has covariates, you can include plotCI = TRUE to add the 95% confidence intervals around the estimated values.
plot(m_tod_obs, ask = F, plotCI = T, plotTracks = F)
## Decoding state sequence... DONE
This shows that the mean step length within each state varies with time of day, with longer steps in the morning. State 1 has a mean ranging from roughly 280-300 m, and state 2 from roughly 600-900 m.
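These ranges can be recovered directly from the regression coefficients: momentuHMM uses a log link for the step length mean, so the state-specific mean at a given time of day is the exponential of the linear predictor (a base-R sketch using the state-1 coefficients printed above):

```r
# State-1 mean step length as a function of time of day (log link)
b0 <-  5.665086    # mean_1:(Intercept)
b1 <- -0.02765113  # mean_1 cosine coefficient
b2 <-  0.0438113   # mean_1 sine coefficient
tod <- seq(0, 24, by = 0.5)
mean_step_1 <- exp(b0 + b1 * cos(2 * pi * tod / 24) +
                     b2 * sin(2 * pi * tod / 24))
round(range(mean_step_1))  # roughly matches the state-1 range quoted above
```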
As the observation data are the same across these models and the model in section 4.2, we can use AIC to compare models with and without time-of-day effects.
AIC(HMM_gps_crw_NAgaps) # without time of day
## [1] 35787.81
AIC(m_tod_tpm) # time of day effect on transition probabilities
## [1] 35792.69
AIC(m_tod_obs) # time of day effect on observation parameters
## [1] 35724.65
This indicates that the model with a time-varying mean step length is the best-fitting model.
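One way to summarise this comparison is through AIC differences and Akaike weights (a base-R sketch using the AIC values printed above):

```r
# Delta-AIC and Akaike weights for the three candidate models
aic <- c(no_tod  = 35787.81,  # no covariates
         tod_tpm = 35792.69,  # time of day on transition probabilities
         tod_obs = 35724.65)  # time of day on mean step length
delta <- aic - min(aic)
weights <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))
round(rbind(delta, weights), 3)
```

With AIC differences of more than 60 relative to the other two models, essentially all of the weight falls on the model with a time-of-day effect on the step length mean.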